Language identification and IT

Author

  • Peter Constable
Abstract

Many processes used within information technology need to be customized to work for specific languages. For this purpose, systems of tags are needed to identify the language in which information is expressed. Various systems exist and are commonly used, but all of them cover only a minor portion of languages used in the world today, and technologies are being applied to an increasingly diverse range of languages that go well beyond those already covered by these systems. Furthermore, there are several other problems that limit these systems in their ability to cope with these expanding needs. This paper examines five specific problem areas in existing tagging systems for language identification, and proposes a particular solution that covers all the world’s languages while addressing all five problems.


Similar resources

A Comparison of Spectral Methods for Spoken Language Identification

Automatic spoken language identification is the task of identifying a language from the speech signal. Language identification systems can be divided into two categories: spectral-based methods and phonetic-based methods. In the former, short-time characteristics of the speech spectrum are extracted as a multi-dimensional vector. A statistical model of these features is then obtained for each language. The ...

Offline Language-free Writer Identification based on Speeded-up Robust Features

This article proposes an offline, language-free writer identification method based on speeded-up robust features (SURF), which goes through training, enrollment, and identification stages. In all stages, an isotropic Box filter is first used to segment the handwritten text image into word regions (WRs). Then, the SURF descriptors (SUDs) of word region and the corresponding scales and orientations (SOs) are extr...

The Effect of English Vowel-Recognition Training on Beginner and Advanced Iranian ESL Learners

This study was an attempt to investigate the effect of vowel-recognition training on beginner and advanced Iranian ESL learners. A total of 36 adult Iranian ESL learners (18 advanced and 18 beginners) who were students of various majors at Memorial University (MUN) were recruited for the study. Advanced participants had the experience of living in Canada for at least three years while beginners...

Identification and Distribution of Interactional Contexts in EFL Classes: The Effect of Two Contextual Factors

This study aims to empirically further awareness of the organization of interaction in EFL classes. Informed by the methodological framework of conversation analysis, it draws upon a corpus of 52 three-hour naturally occurring classroom interactions to identify classroom interactional contexts based on the structuring of the pedagogic goals in turn-taking sequences. Conversation analytic proc...

Finite element model updating of bolted lap joints implementing identification of joint affected region parameters

Cultural Adaptation of Sniffin’ Sticks Smell Identification Test: The Malaysian Version

Introduction: The Sniffin’ Sticks smell identification test is a tool used for the evaluation of olfactory function, but its results are culture-dependent. It relies on the subject’s familiarity with the odorants and descriptors. This study aims to develop a Malaysian version of the Sniffin’ Sticks smell identification test suitable for use in the local population. Materials and Methods: The o...

Publication date: 2000